Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-6109

Avoid extra loops when optimizing statements with ternary expressions

    XMLWordPrintableJSON

Details

    Description

      // org.apache.calcite.linq4j.tree.BlockBuilder#toBlock
      public BlockStatement toBlock() {
        if (optimizing && removeUnused) {
          // We put an artificial limit of 10 iterations just to prevent an endless
          // loop. Optimize should not loop forever, however it is hard to prove if
          // it always finishes in reasonable time.
          for (int i = 0; i < 10; i++) {
            if (!optimize(createOptimizeShuttle(), true)) {
              break;
            }
          }
          optimize(createFinishingOptimizeShuttle(), false);
        }
        return Expressions.block(statements);
      } 

      The above code comes from the org.apache.calcite.linq4j.tree.BlockBuilder class in the Calcite linq4j module.

      1 What is the problem?

      The problem is that for statements with ternary expressions, the for loop in the above code will always execute until the loop is exhausted, although it may not have done any optimization.

      2 How to reproduce this problem?

      We can reproduce the issue in the following ways.

      1. Add some statements with ternary expressions to BlockBuilder through the BlockBuilder#add().
      2. Call the BlockBuilder#toBlock() method.
      3. Observe the for loop in the BlockBuilder#toBlock() method, which is always executed 10 times.

      3 Why does this problem occur?

      The reason is that when OptimizeShuttle traverses the statement, a new instance of TernaryExpression will always be created, regardless of whether the optimization is actually performed.

      This makes BlockBuilder mistakenly believe that this optimization is effective and start the next optimization.

      4 What impact will this issue have?

      This is a performance issue that’s hard to ignore.

      return a != 1 ? b : c; 

      With a simple line of code like the above, the for loop in the org.apache.calcite.linq4j.tree.BlockBuilder#toBlock method will also be executed 10 times.

      If there are hundreds or thousands of statements in BlockBuilder#statements, this impact cannot be ignored.

      Attachments

        Issue Links

          Activity

            People

              xinqiu asdfgh19
              xinqiu asdfgh19
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: